Generative Model-Based Text-To-Speech Synthesis